Skip to content

Conversation

@ambrad
Copy link
Member

@ambrad ambrad commented Oct 19, 2025

The atmosphere semi-Lagrangian tracer transport method's enhanced trajectory method (ETM) is a stealth feature intended to make tracer transport time steps even longer and, in any case, more flexible.

This PR completes the C++/Kokkos version of the ETM, in particular support for semi_lagrange_trajectory_nvelocity > 2.

It also modifies the vertical discretization relative to the original ETM implementation, in light of issues at and around the tropopause when using the ETM in test v3 simulations. This new approach resolves the tropopause issue.

Finally, this PR strengthens the homme_integration test suite as follows:

  • add a new planar tracer transport test with analytical solution at the end time, with space- and time-varying surface pressure;
  • add instances of this test to the homme_integration test suite;
  • add a cmake tool to check error output for the recently added sphere and new plane tracer transport tests when running this suite;
  • activate the Kokkos builds in the non-BFB part of homme_integration, turning on the parts of the unit tests that are meaningful in the non-BFB build and the recently added and new tracer transport tests.

[non-BFB] new tests and existing tests of the ETM
[BFB] all other tests

@ambrad
Copy link
Member Author

ambrad commented Oct 19, 2025

Testing is finished:

  • chrysalis, intel, baselines:
    • e3sm_integration:
      • expected diff: ERS.ne4pg2_oQU480.F2010.chrysalis_intel.eam-thetahy_sl_nsubstep2
      • fail matches cdash: moab test
    • homme_integration
      • BFB and optimized: expected transport-related diffs or lack of baselines
    • e3sm_eamxx_v1
      • expected sl_nsubstep2 diff
      • otherwise matches cdash: mostly pass except two fails
    • SMS_Ld1.ne30pg2_r05_IcoswISC30E3r5.WCYCL1850.allactive-wcprod_1850
      • pass
  • pm-cpu, gnu, baselines: e3sm_developer
    • same as cdash
  • pm-cpu, intel, baselines: e3sm_integration
    • expected diff for sl_nsubstep2, otherwise matches cdash
  • frontier, craycray-mphipcc: e3sm_eamxx_v1_medres, e3sm_eamxx_v1_hires
    • same as cdash
  • pm-gpu, gnugpu, baselines: e3sm_eamxx_v1
    • expected diff for sl_nsubstep2, otherwise matches cdash
  • aurora, oneapi-ifxgpu: e3sm_eamxx_v1_release
    • same as cdash

<hypervis_subcycle_q hgrid=".*pg2">6</hypervis_subcycle_q>
<transport_alg hgrid=".*pg2">12</transport_alg>
<semi_lagrange_trajectory_nsubstep>0</semi_lagrange_trajectory_nsubstep>
<semi_lagrange_trajectory_nvelocity>-1</semi_lagrange_trajectory_nvelocity>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These permit automatic defaults to be computed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind adding the doc="..." string to these parameters, with a brief explanation? To help users know if/when they want to modify these params. Also, if some params have a limited set of valid values, you could also set the valid_values="a,b,c" metadata as well.

ATMCHANGE=$CIMEROOT/../components/eamxx/scripts/atmchange

$ATMCHANGE semi_lagrange_trajectory_nsubstep=2 -b
$ATMCHANGE semi_lagrange_trajectory_nvelocity=3 -b
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update a test now that nvelocity > 2 is implemented for EAMxx.

#AOT flags
SET(SYCL_COMPILE_FLAGS "-std=c++17 -fsycl -fsycl-device-code-split=per_kernel -fno-sycl-id-queries-fit-in-int -fsycl-unnamed-lambda -Xclang -fsycl-allow-virtual-functions")
#SET(SYCL_COMPILE_FLAGS "-std=c++17 -fsycl -fsycl-device-code-split=per_kernel -fno-sycl-id-queries-fit-in-int -fsycl-unnamed-lambda -Xclang -fsycl-allow-virtual-functions")
SET(SYCL_COMPILE_FLAGS "-std=c++17 -fsycl -fsycl-device-code-split=per_kernel -fno-sycl-id-queries-fit-in-int -fsycl-unnamed-lambda")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for standalone-Homme builds on Aurora. It looks like -Xclang -fsycl-allow-virtual-functions is now neither supported nor needed. I'm leaving the previous line commented out in case another compiler version change reinstates this flag and its necessity.

SET (ADD_Fortran_FLAGS "-traceback" CACHE STRING "")
SET (ADD_C_FLAGS "-traceback" CACHE STRING "")
SET (ADD_CXX_FLAGS "-traceback" CACHE STRING "")
SET (BUILD_HOMME_THETA_KOKKOS TRUE CACHE BOOL "")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This enables C++/Kokkos testing in the optimized test of homme_integration. Previously, only the BFB test would build the C++/Kokkos components. The optimized test will now run some unit tests and tracer transport tests that have analytical solutions and checks on those solutions.

@@ -0,0 +1,39 @@
# This utility checks output of the form
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Support homme_integration transport-test error checking in the case of known solutions.


IF (HOMME_ENABLE_COMPOSE)
LIST(APPEND HOMME_TESTS
thetanh-moist-bubble-sl.cmake
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to rearrange some lists to support running C++/Kokkos tests in the optimized homme_integration test in addition to the BFB test.

createTests(HOMME_TESTS)

IF (${BUILD_HOMME_PREQX_KOKKOS})
IF (HOMMEXX_BFB_TESTING AND BUILD_HOMME_PREQX_KOKKOS)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More changes to support C++/Kokkos in the homme_integraiton optimized test.

@@ -0,0 +1,3 @@
THETAL_KOKKOS_SETUP()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for the new planar-transport test.

@ambrad ambrad added HOMME HOMME standalone issues with the standalone HOMME code that dont impact E3SM EAMxx C++ based E3SM atmosphere model (aka SCREAM) non-BFB PR makes roundoff changes to answers. labels Oct 19, 2025
@ambrad ambrad self-assigned this Oct 19, 2025
@ambrad ambrad requested review from bartgol and mt5555 October 19, 2025 01:17
#ifndef CAM
#include "config.h"

module planar_transport_tests
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New planar-transport test, paralleling the sphere tests. Just like the recently added sphere test, surface pressure varies in space and time.

test_case(1:13)== "jw_baroclinic" .or. &
test_case(1:5) == "dcmip" .or. &
test_case(1:5) == "mtest" .or. &
test_case == "planar_hydro_gravity_wave" .or. &
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean up needless checks. This follows the pattern of the other test sets: check just the prefix.

@ambrad ambrad added the CI: approved Allow gh actions PR testing on ghci-snl-* machines label Oct 19, 2025
@ambrad
Copy link
Member Author

ambrad commented Oct 20, 2025

@bartgol I opened this PR for a branch on my fork. Is there something special I need to do to get CI to work? Thanks.

@bartgol
Copy link
Contributor

bartgol commented Oct 20, 2025

@bartgol I opened this PR for a branch on my fork. Is there something special I need to do to get CI to work? Thanks.

Gitlab-ex is having some SSL certificate issue, which is affecting all AT2 customers as well. Hopefully it will get fixed soon. Once it's back online, I will make sure testing runs on this PR.

Copy link
Contributor

@bartgol bartgol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I only have a minor request, for the sake of users.

<hypervis_subcycle_q hgrid=".*pg2">6</hypervis_subcycle_q>
<transport_alg hgrid=".*pg2">12</transport_alg>
<semi_lagrange_trajectory_nsubstep>0</semi_lagrange_trajectory_nsubstep>
<semi_lagrange_trajectory_nvelocity>-1</semi_lagrange_trajectory_nvelocity>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind adding the doc="..." string to these parameters, with a brief explanation? To help users know if/when they want to modify these params. Also, if some params have a limited set of valid values, you could also set the valid_values="a,b,c" metadata as well.

@ambrad
Copy link
Member Author

ambrad commented Oct 26, 2025

The CI testing looks good. The CI EAMxx tests are failing with NLFAIL due to expected

Differences in namelist 'ctl_nl':
  found extra variable: 'semi_lagrange_trajectory_nvelocity'
  found extra variable: 'semi_lagrange_halo'
  found extra variable: 'semi_lagrange_diagnostics'

This helpful message from the CI

NLFAIL <testname> (but otherwise OK) RUN

shows that this is the only reason for the failure; I checked that this indeed is true in a downloaded artifact for one of the tests.

@ambrad
Copy link
Member Author

ambrad commented Oct 27, 2025

I'll merge this when next is stable.

@ambrad
Copy link
Member Author

ambrad commented Oct 29, 2025

There are conflicts with next. After these PRs go into master, I'll rebase this branch on master and then try again to integrate this branch.

@bartgol
Copy link
Contributor

bartgol commented Oct 29, 2025

There are conflicts with next. After these PRs go into master, I'll rebase this branch on master and then try again to integrate this branch.

Ah, sorry, it may have been my PR that removed old unused targets.

@ambrad
Copy link
Member Author

ambrad commented Oct 29, 2025

Ah, sorry, it may have been my PR that removed old unused targets.

Sure, that's fine. I was just noting what I'm waiting for.

The atmosphere semi-Lagrangian tracer transport method's enhanced trajectory
method (ETM) is a stealth feature intended to make tracer transport time steps
even longer and, in any case, more flexible.

This PR completes the C++/Kokkos version of the ETM, in particular support for
semi_lagrange_trajectory_nvelocity > 2.

It also modifies the vertical discretization relative to the original ETM
implementation, in light of issues at and around the tropopause when using the
ETM in test v3 simulations. This new approach resolves the tropopause issue.

Finally, this PR strengthens the homme_integration test suite as follows:
* add a new planar tracer transport test with analytical solution at the end
  time, with space- and time-varying surface pressure;
* add instances of this test to the homme_integration test suite;
* add a cmake tool to check error output for the recently added sphere and new
  plane tracer transport tests when running this suite;
* activate the Kokkos builds in the non-BFB part of homme_integration, turning
  on the parts of the unit tests that are meaningful in the non-BFB build and
  the recently added and new tracer transport tests.
@ambrad ambrad force-pushed the ambrad/homme/isl2pr2 branch from 61340a2 to 4a4c944 Compare October 30, 2025 21:45
ambrad added a commit that referenced this pull request Nov 4, 2025
Homme(xx)/SL: Finish C++/Kokkos for ETM; modify vertical discretization.

The atmosphere semi-Lagrangian tracer transport method's enhanced trajectory
method (ETM) is a stealth feature intended to make tracer transport time steps
even longer and, in any case, more flexible.

This PR completes the C++/Kokkos version of the ETM, in particular support for
semi_lagrange_trajectory_nvelocity > 2.

It also modifies the vertical discretization relative to the original ETM
implementation, in light of issues at and around the tropopause when using the
ETM in test v3 simulations. This new approach resolves the tropopause issue.

Finally, this PR strengthens the homme_integration test suite as follows:
* add a new planar tracer transport test with analytical solution at the end
  time, with space- and time-varying surface pressure;
* add instances of this test to the homme_integration test suite;
* add a cmake tool to check error output for the recently added sphere and new
  plane tracer transport tests when running this suite;
* activate the Kokkos builds in the non-BFB part of homme_integration, turning
  on the parts of the unit tests that are meaningful in the non-BFB build and
  the recently added and new tracer transport tests.

[non-BFB] new tests and existing tests of the ETM
[BFB] all other tests
@ambrad
Copy link
Member Author

ambrad commented Nov 4, 2025

Holding on next until #7791 is resolved.

@ambrad ambrad merged commit 223ac92 into E3SM-Project:master Nov 5, 2025
14 of 20 checks passed
@ambrad
Copy link
Member Author

ambrad commented Nov 5, 2025

@bartgol what do I need to do to clear the NLFAILs in the EAMxx test baselines used for the autotester? Or is that automatic when the PR goes into master? Thanks.

@bartgol
Copy link
Contributor

bartgol commented Nov 5, 2025

@bartgol what do I need to do to clear the NLFAILs in the EAMxx test baselines used for the autotester? Or is that automatic when the PR goes into master? Thanks.

You need to go on the actions tab at the top of the page, and look for the eamxx-v1 workflow on the left. Then, on the top right, click on the drop down menu "Run workflow". It will open a small menu, where you can select the branch from which the workflow should run, the jobs to run, and whether to gen baselines. You should leave 'master' for the branch, ofc, and check the "generate baselines" box. The jobs to run is currently irrelevant, as there is only machine were we run v1 tests with gh actions (the "all" option does the same thing as "cpu-gcc").

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI: approved Allow gh actions PR testing on ghci-snl-* machines EAMxx C++ based E3SM atmosphere model (aka SCREAM) HOMME standalone issues with the standalone HOMME code that dont impact E3SM HOMME non-BFB PR makes roundoff changes to answers.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants